179 research outputs found

    Fast Scalable Construction of (Minimal Perfect Hash) Functions

    Recent advances in random linear systems on finite fields have paved the way for the construction of constant-time data structures representing static functions and minimal perfect hash functions using less space than existing techniques. The main obstruction for any practical application of these results is the cubic-time Gaussian elimination required to solve these linear systems: although they can be made very small, the computation is still too slow to be feasible. In this paper we describe in detail a number of heuristics and programming techniques to speed up the solution of these systems by several orders of magnitude, making the overall construction competitive with the standard and widely used MWHC technique, which is based on hypergraph peeling. In particular, we introduce broadword programming techniques for fast equation manipulation and a lazy Gaussian elimination algorithm. We also describe a number of technical improvements to the data structure which further reduce space usage and improve lookup speed. Our implementation of these techniques yields a minimal perfect hash function data structure occupying 2.24 bits per element, compared to 2.68 for MWHC-based ones, and a static function data structure which reduces the multiplicative overhead from 1.23 to 1.03.
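To make the linear-algebra step in the abstract above concrete, here is a minimal sketch of plain Gauss-Jordan elimination over GF(2) in which each equation is packed into one integer, so a single XOR updates a whole equation at once. This is only an illustration of the word-parallel idea. It is not the paper's lazy Gaussian elimination, and the function name `solve_gf2` and its interface are assumptions made for this example.

```python
def solve_gf2(rows, rhs, num_vars):
    """Solve a linear system over GF(2).

    rows[i] is an int bitmask (bit j set means variable j occurs in
    equation i); rhs[i] is 0 or 1. Returns a list of bits, or None if
    the system is inconsistent. Free variables are set to 0.
    (Hypothetical helper for illustration only.)
    """
    # Store the right-hand side in bit position num_vars of each equation.
    eqs = [r | (b << num_vars) for r, b in zip(rows, rhs)]
    pivot_of_col = {}
    next_row = 0
    for col in range(num_vars):
        bit = 1 << col
        pivot = next((i for i in range(next_row, len(eqs)) if eqs[i] & bit), None)
        if pivot is None:
            continue  # no pivot in this column: free variable
        eqs[next_row], eqs[pivot] = eqs[pivot], eqs[next_row]
        for i in range(len(eqs)):
            if i != next_row and eqs[i] & bit:
                eqs[i] ^= eqs[next_row]  # one XOR rewrites the whole equation
        pivot_of_col[col] = next_row
        next_row += 1
    rhs_bit = 1 << num_vars
    # An equation reduced to "0 = 1" means there is no solution.
    if any(e == rhs_bit for e in eqs):
        return None
    solution = [0] * num_vars
    for col, row in pivot_of_col.items():
        solution[col] = (eqs[row] >> num_vars) & 1
    return solution
```

For example, `solve_gf2([0b011, 0b110, 0b101], [1, 0, 1], 3)` encodes the equations x0+x1=1, x1+x2=0 and x0+x2=1 and returns an assignment satisfying all three.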

    On the probability of rendezvous in graphs

    In a simple graph $G$ without isolated nodes the following random experiment is carried out: each node chooses one of its neighbors uniformly at random. We say a rendezvous occurs if there are adjacent nodes $u$ and $v$ such that $u$ chooses $v$ and $v$ chooses $u$; the probability that this happens is denoted by $s(G)$. Métivier et al. (2000) asked whether it is true that $s(G) \ge s(K_n)$ for all $n$-node graphs $G$, where $K_n$ is the complete graph on $n$ nodes. We show that this is the case. Moreover, we show that evaluating $s(G)$ for a given graph $G$ is a #P-complete problem, even if only $d$-regular graphs are considered, for any $d \ge 5$.
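The random experiment defined above can be checked on very small graphs by brute force. The sketch below enumerates every combination of neighbour choices (all are equally likely, since each node chooses uniformly and independently) and counts those containing a rendezvous. The function names and the adjacency-dictionary representation are assumptions made for this example; the running time is exponential, in line with the hardness result stated in the abstract.

```python
from fractions import Fraction
from itertools import product


def rendezvous_probability(adj):
    """Exact s(G) by exhaustive enumeration.

    adj maps each node to the list of its neighbours; the graph must be
    simple and have no isolated nodes. (Hypothetical helper.)
    """
    nodes = sorted(adj)
    total = 0
    hits = 0
    # Each tuple assigns one chosen neighbour to every node.
    for choice in product(*(adj[v] for v in nodes)):
        pick = dict(zip(nodes, choice))
        total += 1
        # Rendezvous: some u chooses v while v chooses u.
        if any(pick[pick[u]] == u for u in nodes):
            hits += 1
    return Fraction(hits, total)


def complete_graph(n):
    """Adjacency dictionary of K_n on nodes 0..n-1."""
    return {v: [u for u in range(n) if u != v] for v in range(n)}
```

On the triangle, only the two cyclic choice patterns avoid a rendezvous, so `rendezvous_probability(complete_graph(3))` evaluates to `Fraction(3, 4)`.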

    Quicksort, Largest Bucket, and Min-Wise Hashing with Limited Independence

    Randomized algorithms and data structures are often analyzed under the assumption of access to a perfect source of randomness. The most fundamental metric used to measure how "random" a hash function or a random number generator is, is its independence: a sequence of random variables is said to be $k$-independent if every variable is uniform and every size $k$ subset is independent. In this paper we consider three classic algorithms under limited independence. We provide new bounds for randomized quicksort, min-wise hashing and largest bucket size under limited independence. Our results can be summarized as follows.
    - Randomized quicksort. When pivot elements are computed using a 5-independent hash function, Karloff and Raghavan (J. ACM '93) showed $O(n \log n)$ expected worst-case running time for a special version of quicksort. We improve upon this, showing that the same running time is achieved with only 4-independence.
    - Min-wise hashing. For a set $A$, consider the probability of a particular element being mapped to the smallest hash value. It is known that 5-independence implies the optimal probability $O(1/n)$. Broder et al. (STOC '98) showed that 2-independence implies it is $O(1/\sqrt{|A|})$. We show a matching lower bound as well as new tight bounds for 3- and 4-independent hash functions.
    - Largest bucket. We consider the case where $n$ balls are distributed to $n$ buckets using a $k$-independent hash function and analyze the largest bucket size. Alon et al. (STOC '97) showed that there exists a 2-independent hash function implying a bucket of size $\Omega(n^{1/2})$. We generalize the bound, providing a $k$-independent family of functions that implies size $\Omega(n^{1/k})$.
    Comment: Submitted to ICALP 201
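As an illustration of the notion of $k$-independence used above, the following sketch builds the textbook polynomial hash family: a uniformly random polynomial of degree below $k$ over a prime field is $k$-independent on keys from that field. This is a standard construction and not one taken from the paper; the name `make_k_independent_hash` and the choice of the prime $2^{61}-1$ are assumptions made for this example.

```python
import random

# A Mersenne prime; keys are assumed to lie in [0, MERSENNE_PRIME).
MERSENNE_PRIME = (1 << 61) - 1


def make_k_independent_hash(k, rng=random):
    """Return h(x) = sum(c_i * x**i) mod p with k random coefficients.

    Hash values in [0, p) are k-independent for keys in [0, p).
    (Hypothetical helper; reducing the result further, e.g. modulo a
    number of buckets, makes it only approximately uniform.)
    """
    coeffs = [rng.randrange(MERSENNE_PRIME) for _ in range(k)]

    def h(x):
        acc = 0
        # Horner's rule, highest-degree coefficient first.
        for c in reversed(coeffs):
            acc = (acc * x + c) % MERSENNE_PRIME
        return acc

    return h
```

A min-wise hash of a set `A` of integer keys can then be taken as `min(A, key=h)` with, for instance, `h = make_k_independent_hash(4)`.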

    Improved Approximate String Matching and Regular Expression Matching on Ziv-Lempel Compressed Texts

    We study the approximate string matching and regular expression matching problems for the case when the text to be searched is compressed with the Ziv-Lempel adaptive dictionary compression schemes. We present a time-space trade-off that leads to algorithms improving the previously known complexities for both problems. In particular, we significantly improve the space bounds, which in practical applications are likely to be a bottleneck.

    A Reconfigurations Analogue of Brooks’ Theorem

    Let G be a simple undirected graph on n vertices with maximum degree Δ. Brooks’ Theorem states that G has a Δ-colouring unless G is a complete graph, or a cycle with an odd number of vertices. To recolour G is to obtain a new proper colouring by changing the colour of one vertex. We show that from a k-colouring, k > Δ, a Δ-colouring of G can be obtained by a sequence of O(n^2) recolourings using only the original k colours unless G is a complete graph or a cycle with an odd number of vertices, or k = Δ + 1, G is Δ-regular and, for each vertex v in G, no two neighbours of v are coloured alike. We use this result to study the reconfiguration graph R_k(G) of the k-colourings of G. The vertex set of R_k(G) is the set of all possible k-colourings of G and two colourings are adjacent if they differ on exactly one vertex. It is known that if k ≤ Δ(G), then R_k(G) might not be connected and it is possible that its connected components have superpolynomial diameter, whereas if k ≥ Δ(G) + 2, then R_k(G) is connected and has diameter O(n^2). We complete this structural classification by settling the missing case: if k = Δ(G) + 1, then R_k(G) consists of isolated vertices and at most one further component which has diameter O(n^2). We also describe completely the computational complexity classification of the problem of deciding whether two k-colourings of a graph G of maximum degree Δ belong to the same component of R_k(G) by settling the case k = Δ(G) + 1. The problem is O(n^2) time solvable for k = 3, PSPACE-complete for 4 ≤ k ≤ Δ(G), O(n) time solvable for k = Δ(G) + 1, and O(1) time solvable for k ≥ Δ(G) + 2 (the answer is always yes).

    Knocking Out P_k-free Graphs

    A parallel knock-out scheme for a graph proceeds in rounds in each of which each surviving vertex eliminates one of its surviving neighbours. A graph is KO-reducible if there exists such a scheme that eliminates every vertex in the graph. The Parallel Knock-Out problem is to decide whether a graph G is KO-reducible. This problem is known to be NP-complete and has been studied for several graph classes since MFCS 2004. We show that the problem is NP-complete even for split graphs, a subclass of P_5-free graphs. In contrast, our main result is that it is linear-time solvable for P_4-free graphs (cographs).

    Matchings on infinite graphs

    Elek and Lippner (2010) showed that the convergence of a sequence of bounded-degree graphs implies the existence of a limit for the proportion of vertices covered by a maximum matching. We provide a characterization of the limiting parameter via a local recursion defined directly on the limit of the graph sequence. Interestingly, the recursion may admit multiple solutions, implying non-trivial long-range dependencies between the covered vertices. We overcome this lack of correlation decay by introducing a perturbative parameter (temperature), which we let progressively go to zero. This allows us to uniquely identify the correct solution. In the important case where the graph limit is a unimodular Galton-Watson tree, the recursion simplifies into a distributional equation that can be solved explicitly, leading to a new asymptotic formula that considerably extends the well-known one by Karp and Sipser for Erdős-Rényi random graphs.
    Comment: 23 pages

    Forbidden Induced Subgraphs and the Price of Connectivity for Feedback Vertex Set

    Let fvs(G) and cfvs(G) denote the cardinalities of a minimum feedback vertex set and a minimum connected feedback vertex set of a graph G, respectively. For a graph class $\mathcal{G}$, the price of connectivity for feedback vertex set (poc-fvs) for $\mathcal{G}$ is defined as the maximum ratio cfvs(G)/fvs(G) over all connected graphs G in $\mathcal{G}$. It is known that the poc-fvs for general graphs is unbounded. We study the poc-fvs for graph classes defined by a finite family $\mathcal{H}$ of forbidden induced subgraphs. We characterize exactly those finite families $\mathcal{H}$ for which the poc-fvs for $\mathcal{H}$-free graphs is bounded by a constant. Prior to our work, such a result was only known for the case where $|\mathcal{H}| = 1$.

    Wear Minimization for Cuckoo Hashing: How Not to Throw a Lot of Eggs into One Basket

    We study wear-leveling techniques for cuckoo hashing, showing that it is possible to achieve a memory wear bound of $\log\log n + O(1)$ after the insertion of $n$ items into a table of size $Cn$ for a suitable constant $C$ using cuckoo hashing. Moreover, we study our cuckoo hashing method empirically, showing that it significantly improves on the memory wear performance for classic cuckoo hashing and linear probing in practice.
    Comment: 13 pages, 1 table, 7 figures; to appear at the 13th Symposium on Experimental Algorithms (SEA 2014)